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ABSTRACT 

We present a new method of constraining the mass and velocity anisotropy profiles of 
galaxy clusters from kinematic data. The method is based on a model of the phase 
space density which allows the anisotropy to vary with radius between two asymp- 
totic values. The characteristic scale of transition between these asymptotes is fixed 
and tuned to a typical anisotropy profile resulting from cosmological simulations. The 
model is parametrized by two values of anisotropy, at the centre of the cluster and at 
infinity, and two parameters of the NFW density profile, the scale radius and the scale 
mass. In order to test the performance of the method in reconstructing the true cluster 
parameters we analyze mock kinematic data for 20 relaxed galaxy clusters generated 
from a cosmological simulation of the standard ACDM model. We use Bayesian meth- 
ods of inference and the analysis is carried out following the Markov Chain Monte 
Carlo approach. The parameters of the mass profile are reproduced quite well, but 
we note that the mass is typically underestimated by 15 percent, probably due to 
the presence of small velocity substructures. The constraints on the anisotropy profile 
for a single cluster are in general barely conclusive. Although the central asymptotic 
value is determined accurately, the outer one is subject to significant systematic er- 
rors caused by substructures at large clustercentric distance. The anisotropy profile is 
much better constrained if one performs joint analysis of at least a few clusters. In this 
case it is possible to reproduce the radial variation of the anisotropy over two decades 
in radius inside the virial sphere. 

Key words: galaxies: clusters: general - galaxies: kinematics and dynamics - cos- 
mology: dark matter 



1 INTRODUCTION 

Kinematic data play an important role in dynamical studies 
of galaxy clusters. They offer a unique possibility to con- 
strain simultaneously the total (dark and luminous) mass 
profile and the orbital structure of galaxies which is com- 
monly quantified by the anisotropy of velocity dispersion 
tensor (Binney & Tremaine 2008). The main challenge of 
this approach is the fact that both factors, the mass and 
the anisotropy profile, are interconnected. In the particular 
case of commonly used Jeans analysis of velocity disper- 
sion profile this leads to a well known problem of the mass- 
anisotropy degeneracy that hampers the constraining power 
of the data (e.g. Binney & Mamon 1982; Merritt 1987; Mer- 



rifield & Kent 1990). It turns out that in general there are 
two ways to break this degeneracy: one can either combine 
the results with other constraints on the mass profile or use 
a dynamical model going beyond the Jeans equation. 

The first solution has been widely applied in the litera- 
ture. Promising constraints on the anisotropy of galactic or- 
bits in clusters have been obtained by combining the results 
from velocity dispersion profiles with the mass constraints 
from X-ray gas (e.g. Benatov et al. 2006; Hwang & Lee 2008) 
or lensing data (e.g. Natarajan & Kneib 1996). A different 
approach was adopted by Biviano & Katgert (2004) who 
argued for an isotropic velocity dispersion tensor of early 
type galaxies in clusters from the ESO Nearby Abell Clus- 
ter Survey (ENACS). Assuming this property they inferred 
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the total mass profile from velocity dispersions of ellipticals 
and used this result to constrain the anisotropy profile of 
late type galaxies. This approach was an implementation of 
the anisotropy inversion algorithm (Binney & Mamon 1982; 
Solanes & Salvador-Sole 1990). 

The second method of breaking the mass-anisotropy de- 
generacy relies on the extension of the classical Jeans for- 
malism. The first natural step in this field is to consider the 
fourth velocity moment or kurtosis (Merrifield & Kent 1990) . 
The method based on the joint fitting of velocity moments 
(dispersion and kurtosis) was shown to provide constraints 
both on the mean anisotropy of a system and the parameters 
of the mass profile (Lokas 2002; Lokas & Mamon 2003; San- 
chis, Lokas & Mamon 2004; Lokas et al. 2006). These results 
confirmed the idea that any attempt to infer the anisotropy 
profile from kinematic data of spherical systems must be pre- 
ceded by the construction of a detailed dynamical model. In 
general there are two ways to achieve a desired complex- 
ity of a model. One is the so-called Schwarzschild modelling 
(Schwarzschild 1979) in which one considers a superposition 
of base orbits defined in the integral space (e.g. Merritt & 
Saha 1993; Gerhard et al. 1998; Chaname, Kleyna & van der 
Marel 2008). Another one, which we adopt in this work, is 
to provide a properly parametrized form for the phase space 
density. 

There were several studies devoted to the analysis of 
kinematic data in terms of the phase space density. An im- 
portant step in this field was made by Kent & Gunn (1982) 
who used a family of simple analytical models of the dis- 
tribution function to analyze the data for the Coma clus- 
ter. Van der Marel et al. (2000) obtained constraints on the 
anisotropy of 16 galaxy clusters from the CNOC1 (Cana- 
dian Network for Observational Cosmology) cluster redshift 
survey. A conceptually similar analysis was carried out by 
Mahdavi & Geller (2004) for galaxy groups and clusters. In 
all cases a constant anisotropy was assumed which does not 
reproduce well the results of cosmological simulations where 
the dependence of the anisotropy on radius is usually seen 
(e.g. Mamon & Lokas 2005; Wojtak et al. 2005; Ascasibar 
& Gottlober 2008). Since the anisotropy profile has recently 
become a subject of growing interest, it appears reasonable 
to generalize the above methods so that both the mass and 
the anisotropy profiles may be inferred from the data. This 
implies that one has to deal with an anisotropic model of 
the phase space density which accounts for its radial vari- 
ation. Quite recently several models satisfying this require- 
ment have been proposed. The anisotropy profile is specified 
by a proper parametrization of the angular momentum part 
of the distribution function (Wojtak et al. 2008) or the aug- 
mented density (Van Hese, Baes & Dejonghe 2009). The 
purpose of the present work was to adopt the approach of 
Wojtak et al. (2008) to the Bayesian analysis of kinematic 
data and to test on mock data sets how well the mass and 
anisotropy profiles are reproduced. 

The paper is organized as follows. In the first section we 
introduce a model of the phase space density and discuss its 
projection on to the plane of sky. In section 2 we describe 
mock kinematic data of galaxy clusters generated from a 
cosmological simulation. Section 3 provides technical details 
on the Monte Carlo Markov Chain analysis and section 4 
presents the results. The discussion follows in section 5. 



2 THE PHASE SPACE DENSITY 

Any spherical system in equilibrium embedded in a fixed 
gravitational potential is described completely by the dis- 
tribution function which depends on the phase space co- 
ordinates through the binding energy E and the absolute 
value of the angular momentum L. In this work, we use 
the model of the distribution function recently proposed by 
Wojtak et al. (2008) which was shown to recover spherically 
averaged phase space distribution of dark matter particles 
in simulated cluster-size haloes. The main idea of this ap- 
proach lies in the assumption that the distribution func- 
tion is separable in energy and angular momentum, i.e. 
f(E, L) = /_e(_B)/_l(L). The angular momentum part 
is given by an analytical ansatz motivated by the purpose of 
providing an appropriate parametrization of the anisotropy 
profile, which is traditionally quantified by the so-called 
anisotropy parameter 



P(r) = 1 



*f(r) 
o-2(r) ' 



(1) 



where ag and a r are dispersions of the tangential and radial 
velocity respectively. This part of the distribution function 
takes the following form 
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where j3o and Poo are the asymptotic values of the anisotropy 
parameter at the halo centre and at infinity respectively. The 
scale of transition between these two asymptotes is deter- 
mined by Lo, whereas a typical radial range of the growth or 
decrease of f3(r) is fixed at about 2 orders of magnitude cen- 
tred on the radius corresponding to Lq (see the anisotropy 
profiles in the top right panel of Fig. [TJ. Although some re- 
cent models of the distribution function offer a little more 
flexible parametrization of the anisotropy profile (e.g. Baes 
& Van Hese 2007; Van Hese et al. 2009), we find that our 
choice is quite suitable for the purpose of this work, given 
that we wish to reproduce the variability of (5{r) with as few 
parameters as possible. 

The energy part of the distribution function fs{E) is 
given by the solution of the integral equation 

r2 \ -/3oo+/3 
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This equation can be simplified to the one-dimensional in- 
tegral equation and then solved numerically for /e (see Ap- 
pendix B in Wojtak et al. 2008). As an approximation for 
the density profile in (0 we use the NFW profile (Navarro, 
Frenk & White 1997), i.e. 

D ( r l r \ = I I Ml (A) 

PK ' ' 4ir(ln2-l/2) (r/r s )(l + r/r s ) 2 rf ' K ' 

where r s is the scale radius and M s is the mass enclosed in 
a sphere of this radius. Both parameters of the mass pro- 
file provide natural scales of the phase space coordinates so 
that any change of M s or r s corresponds to the expansion 
or contraction of a system in the space of velocities or the 
positions. For the sake of convenience, we use the scaling 
which fixes the range of positively defined binding energy 
per unit mass E at [0, 1], namely r s as the unit of radius 
and V s = (GM s /r s ) 1/2 (ln2 - 1/2)^ 1/2 as the unit of veloc- 
ity (see Wojtak et al. 2008). 
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Due to projection effects, a fraction of the phase space 
is not accessible to observation. An observer is able to mea- 
sure velocity along the line of sight vi os and the position on 
the sky which can be easily translated into the projected 
clustercentric distance R. This data set, when plotted vi OB 
versus R, is commonly referred to as the phase space dia- 
gram or the velocity diagram. Data points in such diagrams 
are distributed according to the projected phase space den- 
sity which is given by (e.g. Dejonghe & Merritt 1992; Merritt 
& Saha 1993; Mahdavi & Geller 2004) 

fi os (R,vi os ) = 2tvR [ ""dz / / d«Hdw / B (£)/z,(L), (5) 

J-z m , J J E>0 

where z is the distance along the line sight from the cluster 
centre, while vr and V4, are velocity components in cylindri- 
cal coordinates. We will hereafter refer to f\ OB (R, vi oa ) as the 
phase space density. The integration area of the last two in- 
tegrals is given by the circle v R + < 2*l/(v / R 2 + z 2 ) — vf os , 
where ^(r) is a positively defined gravitational potential 
for the NFW density profile, *(r) = V 2 Ln(l + r/r s )/(r/r s ) 
(Cole & Lacey 1996). The boundaries of the integral along 
the line of sight are given by the distance ±z max at which 
v\o B {R) becomes the escape velocity. It is worth mentioning 
that in ((5]) we assume that the density profiles of a tracer 
(galaxies) and dark matter are proportional to each other. 
Second, /i DS is defined up to the normalization which must 
therefore be imposed by the following additional condition 

2/ dR fi os (R,v loa )dvi os = 1, (6) 

Jo Jo 

where 7? max is a cut-off radius of the phase space diagram. 
From ((6]) one can immediately infer that the normalization 
factor is {r 3 T/ s [ A ^ios(-Rmax)/M s ]}~ 1 , where M los (R max ) is the 
projected mass within the aperture i? ma x for the NFW den- 
sity profile (see Lokas & Mamon 2001 for an analytical ex- 
pression). From now on when referring to the phase space 
density /i os we will always mean that it is properly normal- 
ized as in (B). 

We calculate the phase space density (0 numerically us- 
ing the algorithm of Gaussian quadrature to evaluate each 
integral (Press et al. 1996). The energy part of the distribu- 
tion function fs (E) is interpolated between points which are 
equally spaced in energy and provide the numerical solution 
of equation ((3]). In order to remove improper boundaries of 
the integral along the line of sight for v\ os = we changed 
the variable z into ty. Next, to avoid singularity along the 
line L 2 = (R 2 + z 2 )v\ + (v R z - vi OB R) 2 = 0, when f3 > 0, 
we used an even number of abscissas for the variable v^, so 
that /i(i) is never evaluated at L = 0. 

Fig. [1] shows the contour maps of phase space density 
f\ OB (R, vi OB ) for 5 different anisotropy profiles plotted in the 
top right panel. In all cases we used i? max = 5r 3 , a typical 
virial radius of massive galaxy clusters, and the scale of tran- 
sition between /3o and (3^, Lo — 0.2V s r s , which corresponds 
to ~ lr s . To facilitate comparison, in the four bottom panels 
we plot the differences between the given and the isotropic 
(/3o = Poo = 0) /i os . The contours below the 0.01 threshold 
are neglected as insignificant for the typical number of data 
points considered in this work. 

The results shown in Fig. \T\ are quite intuitive to inter- 
pret. We notice that /3o modifies /i os in the central part of 
the diagram in such a way that radially biased models pre- 
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R/r s R/r s 

Figure 1. The projected phase space density fi OB (R, v\ os ) for five 
anisotropy profiles characterized by different combinations of fio 
and (3^2 and plotted in the top right panel. The top left panel is 
a contour map of the isotropic /i os (/3o = /3oo = 0) and the four 
bottom panels show the differences between a given /i os and the 
isotropic one. The solid lines labelled by v(R) = ^J2^(R) are the 
profiles of the maximum escape velocity. 

diet more high velocity particles than tangentially biased 
ones (see the middle panels). On the other hand, /3oo influ- 
ences the outer part of a diagram so that the tangentially 
biased anisotropy suppresses /i os at v\ OB = and increases it 
for moderate velocities (see the bottom panels). In the limit 
of j3oo -C the velocity distribution at R/r a 3> 1 takes the 
characteristic horn-like shape with a minimum at v\ OB = 
and a maximum at the circular velocity ±[GM(R)/R] 1 ^ 2 
(see van der Marel &i Franx 1993 for a qualitative com- 
parison). As a final remark on Fig. [T] let us note that the 
parameters of the mass profile, M s and r s , manifest them- 
selves only in the scaling of the axes of the diagram while 
preserving the shape of the phase space density. This prop- 
erty, when put together with the way how /i os depends on 
the anisotropy, automatically excludes the mass-anisotropy 
degeneracy that is an intrinsic problem of the analysis based 
on the Jeans equation (see e.g. Merrifield & Kent 1990). 



3 THE MOCK CATALOGUE OF PHASE 
SPACE DIAGRAMS 

The model of the phase space density introduced in the pre- 
vious section provides an idealized description of the real 
data. It neglects various secondary effects that occur in 
real galaxy clusters and perturb their phase space diagrams. 
Among those the most important seem to be: the breaking 
of spherical symmetry, the presence of substructures, the 
finite size of equilibrated zone and filamentary structures 
surrounding the clusters and infalling towards them. In or- 
der to study the impact of such effects on the reliability of 
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Figure 2. The anisotropy profiles in 20 dark matter haloes se- 
lected from the simulation. The two panels show separately the 
profiles for a subsample of ten haloes with more steeply (left) and 
less steeply (right) increasing profiles. 



our approach to data analysis we need to construct a mock 
catalogue of phase space diagrams which on the one hand 
resembles the real spectroscopic cluster survey and on the 
other provides the true mass and anisotropy profiles. This 
can be easily achieved with the use of cosmological simula- 
tions. 

We used an JV-body cosmological simulation of a 
ACDM model. The simulation was carried out in a box of 
160/i _1 Mpc with WMAP3 cosmological parameters (Spergel 
et al. 2007): fl m = 0.24, Qa = 0.76 and the dimensionless 
Hubble constant h — 0.73 (for more details see Wojtak et al. 
2008). We identified all cluster-size haloes at redshift z — 
and selected 20 of them that appeared not to be products of 
recent major mergers. For each halo we measured the virial 
mass M v and the virial radius r v which are defined in terms 
of the mean density within the virial sphere by the following 
equation 



M v 



= A c p c , 



(T) 



(4/3)7rrg 

where A c is the so-called overdensity parameter. For cos- 
mological model under consideration we adopted A c = 94 
(see e.g. Lokas & Hoffman 2001). The virial mass of selected 
haloes is in the range (1.5 — 20) x 10 14 Mq. The minimum 
number of particles inside the virial sphere is 4.2 x 10 . 

By fitting the NFW formula © to the density profile 
calculated in radial bins equally spaced in logarithmic scale 
we determined the scale radius r s of each halo and then 
the concentration parameter c = r v /r s . The scale mass M s 
was measured directly as the mass enclosed by the sphere of 
radius r s . We found that the mass profile extrapolated from 
r 3 outward to the virial sphere recovers the virial mass with 
the precision of a few percent. 

Fig. [2] shows the anisotropy profiles in selected dark 
matter haloes. The profiles were measured from velocity dis- 
persions of dark matter particles inside thin spherical shells 
of radii extending to l.5r v . Although a general trend for 
radial variation of the anisotropy is very clear, the profiles 
are considerably scattered. One can distinguish at least two 
classes of anisotropy profiles: steeply rising and flat or mod- 
erately rising. According to this criterion we divided our halo 
sample into two equally numerous groups of ten haloes each 
(see the left and right panel of Fig. [2}. The reason for this 
division will become clear in section 5, where we consider 
constraints on the anisotropy profile from the joint analysis 
of many phase space diagrams. 

In order to construct phase space diagrams associated 
with the haloes we pick a line of sight and find all particles 



inside a cylinder of observation. The selection is restricted to 
the particles with the projected halocentric distances R < r v 
and velocities along the line of sight v\ os (with the Hubble 
flow included) in the range ±4000 km s~ in the rest frame 
of a halo, which is the commonly used and sufficiently con- 
servative velocity cut-off. The final phase space diagrams 
are generated by drawing randomly 300 particles for each 
halo, where we assume, to make the scheme unambiguous, 
that the tracer is given simply by the particles. The number 
of data points is fixed at a typical number of spectroscopic 
redshifts available for a nearby galaxy cluster with the same 
criterion of velocity cut-off. The selection of particles is done 
three times for three lines of sight chosen to be parallel to 
the Cartesian axes of the simulation box. Thus we effectively 
get three independent sets of phase space diagrams for the 
same sample of 20 haloes. 

Due to projection effects, the phase space diagrams in- 
clude the so-called interlopers: the particles (galaxies) of 
background or foreground which are not physically con- 
nected to the haloes (clusters) . It is commonly accepted that 
any dynamical analysis should be necessarily preceded by 
an identification and removal of as many of these objects as 
possible. In this work we apply the procedure proposed by 
den Hartog & Katgert (1996) which appears to be one of the 
most effective algorithms for interloper removal (Wojtak et 
al. 2007; Wojtak & Lokas 2007) . The method is based on an 
iterative rejection of galaxies with velocity exceeding a max- 
imum velocity which is a function of the clustercentric dis- 
tance R. The maximum velocity is calculated using a model 
which allows the galaxies to be either on circular orbit with 
velocity v c i T = \J GM(r)/r or to fall towards the cluster cen- 
tre with velocity v2i>cir, where M(r) is approximated in each 
iteration by the virial mass estimator (Heisler, Tremaine & 
Bahcall 1985). The phase space diagrams preprocessed in 
such a way are suitable for the proper analysis in terms of 
the phase space density. Hereafter, when referring to the 
phase space diagram we will always mean the diagram after 
interloper removal. 



4 BAYESIAN ANALYSIS AND MCMC 

Our aim is to constrain the parameters of the mass and 
anisotropy profiles by analyzing the mock phase space dia- 
grams generated from the simulation. Following the princi- 
ple of Bayesian inference our knowledge of model parameters 
a given the data D may be expressed as the posterior proba- 
bility p(a\D) which is related to the probability of obtaining 
the data in a given model p(D\a) (likelihood) and the prior 
probability p(a) via equation 



p(a\D) = 



p(a)p(D\a) 
p(D) 



(8) 



where a is a vector of model parameters and D stands for 
the data. The probability p(D) plays the role of the nor- 
malization coefficient and can be neglected without loss of 
generality. 

The data D of a phase space diagram consists of N 
points located at (R4, «i os ,i) ; where i = 1, . . . , N. Assuming 
that the number and mass density are proportional to each 
other the likelihood is given by 
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N 

,v\ os ,i\a), (9) 

where /i oa is the phase space density given by ([5]) and 
a — {M 3 ,r s , /3o, /3oo, Lo}. It is worth mentioning that this 
formula does not take into account statistical errors of vi OB 
and R. This is motivated by the fact that the errors of spec- 
troscopic redshift measurements for nearby clusters are quite 
small and have negligible impact on the results of the anal- 
ysis (see e.g. van der Marel et al. 2000). 

4.1 Priors and reparametrization 

The two parameters of the mass profile, M s and r s , are the 
scale parameters. The most appropriate prior probability 
for such parameters is the Jeffreys prior which represents 
equal probability per decade (Jeffreys 1946), i.e. p(M s ) oc 
1/M S and p(r B ) oc l/r s . We will use this kind of prior in the 
analysis of any single phase space diagram. The situation 
will change in subsection 5.3 where we consider joint analysis 
of many phase space diagrams. 

The commonly used anisotropy parameter f3(r) gives 
rise to unequal weights for tangentially and radially biased 
models. In order to make the prior probability for both 
types of anisotropy equal we introduce, following Wilkinson 
et al. (2002) and Mahdavi & Geller (2004), the following 
reparametrization 

In^g = _Im_ / 3( r )]. (10) 
0-9 (r) 2 

Using a uniform prior for \n(o r / ae)o = — l/21n(l — /3o) and 
ln(ov/o"e)oo = — l/21n(l — /3<x>) we put equal weights to all 
types of anisotropy. The ranges of both priors are limited by 
A) < 1/2 and /3oo < 0.99. The first condition is motivated by 
the requirement of the distribution function to be positive 
(see An & Evans 2006), whereas the second one allows us 
to avoid improper posterior distribution that is important 
when the data favour /3oo < 1. 

The scale of transition between the two asymptotic val- 
ues of anisotropy is defined by Lo- We find that keeping 
this parameter free causes degeneracy with /3oo and /3o- This 
considerably restricts the information on the growth or de- 
crease of the anisotropy profile that f3o and /3oo are expected 
to convey. To overcome this difficulty we fix Lq at 0.2V s r a , 
a value corresponding to the ~ lr s transition scale expected 
from cosmological simulations (see Wojtak et al. 2008). This 
implies that the anisotropy profiles approach their asymp- 
totic values at some characteristic scales of radius, namely 
Po — f3(r < 0.1r s ) and /3oc — (3(r > 10r s ). Fixing Lo reduces 
the dimension of the parameter space by one so that the 
analysis is based on the four-parameter phase space model 
specified by M B , r s , j3o and Poo- 

4.2 MCMC approach 

Our purpose is to calculate the posterior probability den- 
sity and provide credibility regions in the parameter space. 
This involves numerical integrations of multivariate semi- 
Gaussian functions that can be efficiently tackled with the 
Markov Chain Monte Carlo (MCMC) algorithm. The main 
idea of this approach is to construct a sufficiently long chain 



of models which are distributed in the parameter space ac- 
cording to the posterior probability density. Once such a 
chain is provided, one can easily compute the marginal prob- 
ability distributions by projecting all points on to an appro- 
priate parameter subspace and evaluating the histograms. 

We construct the Markov chains following the 
Metropolis-Hastings algorithm (see e.g. Gregory 2005). In 
the first step of this algorithm one picks a trial point in 
parameter space at+i by drawing from the so-called pro- 
posal distribution q(a\at) centred on the previous point of 
the chain at. Then the point at+i is accepted with the prob- 
ability equal to min{l, r}, where r is the so-called Metropolis 
ratio given by 

r _ p(at+i\D)q(at\at+i) ^ 
p(at\D)q(a t+ i\a t ) 

otherwise it takes the value of its predecessor, at+i = at. 
In the case of a few parameter model the proposal proba- 
bility density can be any function that roughly resembles 
the target posterior distribution. In our work we use a mul- 
tivariate Gaussian with a diagonal covariance matrix. The 
variances are calculated from short trial Markov chains of 
about 2000 models, where our best initial guess for parame- 
ter variances is used. This approach is a simplified version of 
the scheme outlined in Widrow, Pym & Dubinski (2008). It 
is worth mentioning that the proposal distribution is sym- 
metric, q(at+i\at) — q(at\a t +i), so that the Metropolis ratio 
is given by p(a t+ i\D)/p(at\D). 

An important issue of the MCMC analysis is a rela- 
tive number of distinct points in the parameter space which 
is described by the so-called acceptance rate (the average 
rate at which proposed models are accepted). A recom- 
mended value of this parameter for many-parameter models 
is around 20 — 30 percent (see e.g. Gregory 2005). We note 
that our choice of the proposal distribution keeps the accep- 
tance rates of the resulting Markov chains within this range. 
Some additional modification of the proposal distribution is 
required in subsection 5.3, where we consider a joint analysis 
involving N p = 22 parameters. In order to keep the accep- 
tance rate at a desirable level all initial variances of the pro- 
posal distribution are scaled by 2.4 2 jN v (see e.g. Gelman et 
al. 1995) . This correction prevents the acceptance rate from 
dropping, due to a large number of parameters in use, to a 
very low value of around 1 percent. 

When applying the MCMC algorithm, it is crucial to 
check whether the chains explore properly the parameter 
space, i.e. whether they possess the property often referred 
to as mixing. To save computing time, we decided not to 
create several chains for each data sample that is a com- 
monly advised way to look for convergence. As an alterna- 
tive, we follow two simple indicators of mixing. First, we cal- 
culate parameter dispersions in two halves of a given chain 
and within the whole chain. If the relative differences be- 
tween them do not exceed 10 percent for each parameter, 
the chain is considered to be mixed. Second, we monitor the 
variation of the posterior probability along the chains. Un- 
less the profiles of p(a\D) exhibit a long-scale tendency to 
grow or decline, the chain is again expected to be mixed. We 
note that the so-called burn-in part of the chains, when the 
first models gradually approach the most favoured zone of 
the parameter space, is not longer than 1 percent of the total 
length of the chains. All chains used in this analysis consist 



© 0000 RAS, MNRAS 000, 000-000 



6 



R. Wojtak et al. 



-1.6 



-1.0 



Po 

-0.4 -0.2 0.0 



0.2 



0.4 



0.6 



1.5 



1.4 



1.3 



1.2 



1.0 
0.9 
0.8 
0.7 



0.8 1.0 
(<V<%>0 



1.2 



0.98 

0.95 
0.9 

8 

0.8 

0.6 
0.4 

0.0 
-0.4 



1.4 



; — //& 






i i i i i i 


i i i i i i i i i 



0.7 0.8 0.9 1.0 1.1 1.2 1.3 1.4 1.5 



Figure 3. The la credibility regions inferred from the MCMC 
analysis of three theoretical phase space diagrams generated from 
the distribution function with /3q = 0.1 and /3oo = 0.5. The pa- 
rameters of the mass profile were rescaled by their true values. 
Solid, dashed and dotted lines correspond to the diagrams with 
9000, 3000 and 300 data points. Straight lines indicate the true 
values of the parameters. 



of 10 4 models that is above the recommended minimum of 
330iV p (Dunkley et al. 2005). 
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Figure 4. Projected phase space diagram obtained from the 
full phase space distribution function with anisotropy parameters 
Po = 0.1 and /3oo = 0.5. Solid and dashed lines show respec- 
tively the isodensity contours of the best-fitting model and the 
smoothed surface density of 9000 sampling points. Two envelope 
lines labelled by v(R) = ±^/2\&(il) are the profiles of the escape 
velocity. 



on top of the smoothed contour map of the phase space den- 
sity (dashed lines) in Fig. [4] From Fig. [3] we conclude that 
typical relative errors of M s and r s from the analysis of a sin- 
gle phase space diagram with N = 300 points are about 20 
percent. On the other hand, the corresponding constraints 
on the anisotropy parameters are very poor. The marginal 
distribution is so wide, particularly along /3oo, that it is ex- 
pected to be sensitive to any kind of noise in the real data. 
Anticipating the results of the following section we empha- 
size that the only way to reliably constrain the anisotropy 
profile is to increase the number of data points. In practice, 
this can be achieved in the joint analysis of at least 10 inde- 
pendent diagrams (3000 data points), where one assumes a 
universal anisotropy profile. The result for TV = 3000, 9000 
in the top panel of Fig. [3] shows what we can expect from 
such analysis. 



4.3 Number of data points 

One of the leading factors affecting the posterior probabil- 
ity is the limited number of data points. In order to study 
the impact of this effect we carried out the analysis of three 
theoretical phase space diagrams with 300, 3000 and 9000 
points that correspond to a typical size of a data sample for 
a single nearby galaxy cluster (the first number) and to a 
compilation of 10 — 30 of them (the last two numbers). The 
diagrams were generated from a discrete representation of 
the full phase space distribution function with the following 
parameters of the anisotropy profile: /3o = 0.1, /3oo — 0.5 
and Lo = 0.2V s r s . The phase space was sampled using the 
acceptance-rejection technique (Press et al. 1996) in a man- 
ner described by Kazantzidis, Magorrian & Moore (2004). 

Fig. |3] shows the la credibility regions, i.e. the regions 
enclosing 68.3 percent of the corresponding marginal proba- 
bility, inferred from the MCMC analysis of the three theoret- 
ical diagrams. For the sake of simplicity, the parameters of 
the mass profile were rescaled by the true values. The best- 
fitting model for TV = 9000 case is overplotted in solid lines 



5 RESULTS OF THE ANALYSIS 

We carried out the MCMC analysis of 60 phase space di- 
agrams of our mock data catalogue described in section 3. 
The results are presented in the form of the maximum a pos- 
teriori (MAP) values of the parameters and the errors that 
correspond to the boundary of the lcr credibility regions of 
the marginal distributions, i.e. the regions enclosing 68.3 
percent of the marginal probability. 

5.1 Mass profile 

Fig. [5] shows the results for the parameters of the mass pro- 
file, M a and r s . From the diagrams showing residuals, at- 
tached below the mains plots, we conclude that in half of 
the cases the relative errors do not exceed 25 percent (see 
the lines of the first and third quartile of the residuals). On 
the other hand, around 15 percent of the results differ from 
the real values by more than 50 percent that can hardly be 
reconciled with the expectations from the effect of a shot 
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Figure 5. Mass profile parameters obtained from the MCMC 
fits of 20 individual simulated clusters observed in 3 directions. 
The symbols indicate MAP (maximum a posteriori) values and 
the errors correspond to the boundary of the la credibility region 
of the marginal distributions. Filled circles, asterisks and empty 
circles refer to directions of projection along the x, y and z axis 
of the simulation box. Solid broken lines indicate the true values 
of the parameters. The three lines in the lower panels showing 
fractional residuals from the true parameter values indicate the 
quartiles of a total set of 60 residuals. The clusters are ordered 
by decreasing scale mass M s . 



noise. Analyzing a few tens of theoretical phase space di- 
agrams from subsection 4.3 with N = 300 data points we 
find that residuals induced by a shot noise do not exceed 
30 percent. This may suggest that the most outlying points 
in Fig. [5] are probably subject to non-negligible systematic 
errors. Potential sources of these errors could include projec- 
tion effects as well as those associated with internal struc- 
ture. We looked for the influence of the shape of dark matter 
haloes and found no correlation between the accuracy of pa- 



Figure 6. Same as Fig. [5] with the scale mass and radius con- 
verted to the virial mass M v and concentration c. The clusters 
are ordered by decreasing virial mass. 

rameter estimates and the halo ellipticity or the alignment 
of the major axis with respect to the line of sight. 

All Markov chains can be easily transformed into the 
chains of models parametrized by the virial mass M v and 
the concentration c. The resulting credibility ranges and the 
MAP values of these parameters are shown in Fig. EI We find 
that the scatter in residuals is comparable to the case where 
the scale parameters were used and the positions of the most 
discrepant points imply the presence of similar systematic 
errors. There is noticeable progress in the constraints on 
the concentration parameter in comparison to previous work 
based on modelling velocity moments (Sanchis et al. 2004; 
Lokas et al. 2006). On the other hand, there is no convincing 
evidence for a similar improvement for the virial mass. 

The MAP values of the mass parameters typically un- 
derestimate the true values by 15 percent and 11 percent for 
the virial and scale mass respectively (see the position of me- 
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Figure 7. Prospects of reconstructing the mass-concentration 
relation, based on our MCMC analysis of 20 individual simulated 
clusters observed in 3 directions labelled by x, y and z (from 
the top to the bottom panel). The solid lines are the power-law 
fits to the best-fitting parameters obtained from the analysis of 
the phase space diagrams (filled circles). Open circles indicate 
the true values of the parameters for our 20 clusters. The dashed 
line repeated in each panel is a power-law fit to the mean mass- 
concentration relation for WMAP3 cosmology from Maccio et al. 
(2008). 



dian lines in the diagrams showing residuals in Figs. [6] and 
[5|. Interestingly, a similar offset was reported by Biviano 
et al. (2006) for the virial mass estimated from the Jeans 
analysis of velocity dispersion profiles. They suggested that 
this bias is related to the presence of interlopers infalling 
towards the cluster along filaments. Due to relatively small 



velocities in the rest frame of a cluster, these objects cannot 
be identified by any algorithm of interloper removal and, 
therefore, remain in the sample decreasing the velocity dis- 
persion and the resulting virial mass. We think that a similar 
mechanism is likely responsible for the offset of our results. 
Nevertheless, we emphasize that this bias is smaller than the 
statistical errors obtained in the MCMC analysis so that the 
overall effect is not statistically significant. 

It is a well known fact the concentration parameter is 
weakly correlated to the virial mass (e.g. Navarro et al. 1997; 
Bullock et al. 2001). This so-called mass-concentration rela- 
tion is an imprint of the formation history and is therefore 
thought to be one of the predictions of the cosmological 
model. Fig. [7] demonstrates the prospects of recovering this 
relation with our approach. Each panel in this Figure shows 
the results obtained for 20 phase space diagrams for a given 
direction of observation (filled circles with error bars) and 
the best power-law fit (solid line). For comparison with the 
prediction of the ACDM model, we plot the true parameters 
of our 20 clusters (open circles) and with a dashed line the 
power-law fit to the mean mass-concentration relation for 
relaxed haloes simulated in the framework of WMAP3 cos- 
mology from Maccio, Dutton & van den Bosch (2008). We 
find that, within errors, the empirical M v -c relation is con- 
sistent with a flat profile in all cases. For one projection (top 
panel) the slope deviates from the expected value of —0.1 
by more than two sigma. The normalization within the mass 
range under consideration does not exhibit any offset with 
respect to the prediction so that future determination of this 
quantity for real clusters seems to be feasible. This result is 
particularly important in the context of recently discussed 
inconsistency between the normalization from the simula- 
tions and observational constraints (see e.g. Comerford & 
Natarajan 2007). 

5.2 The anisotropy profile 

Fig. [8] shows the results of the MCMC analysis for the 
anisotropy parameters. Since the phase space model is well- 
defined only within the virial sphere, we decided to replace 
the parameter Poo with /3s rs measured at 5r s which is the 
typical scale of the virial radius. This allows us to avoid 
extrapolation of the model beyond the virial sphere. For 
comparison the Figure also shows in gray boxes the range 
of anisotropy values at 0.1r s and 5r s in two samples of clus- 
ters introduced is section 3 which differ in steepness of their 
anisotropy profiles. 

The majority of results for f3o are consistent with the 
true values. The errors are quite large, however, so they 
would not allow to distinguish between radially and tangen- 
tially biased anisotropy. We emphasize that all la credibility 
ranges are clearly detached from the upper boundary of the 
prior probability, (5 — 0.5. This means that this limit is not 
artificial but is embedded in the phase space structure of an 
equilibrated system. 

The anisotropy at 5r s is poorly constrained. Although 
there is a weak tendency for the results to cluster around the 
true values, most of them are shifted considerably upwards 
or downwards. An important circumstance responsible for 
this bias is the fact that the posterior probability distribu- 
tion is so wide in the space of anisotropy parameters that 
it is sensitive to any irregularities in the phase space dia- 
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Figure 8. Anisotropy profile parameters obtained from the 
MCMC fits of 20 individual simulated clusters observed in 3 di- 
rections. Gray boxes indicate the range of the true anisotropy at 
0.1r s (top panel) and 5r 3 (bottom panel). The first and second 
ten clusters correspond to the samples with fiat to moderately ris- 
ing and steeply rising anisotropy profiles respectively, as shown 
in Fig. [2] The meaning of all symbols is the same as in Fig. \E\ 



gram. These irregularities often occur in the outermost part 
of the diagrams, mostly due to subclustering and the pres- 
ence of small velocity interlopers from the outside of the 
virial sphere. The overall consequence is that the posterior 
probability is often perturbed in /3oo (or j3$ rB ) and typically 
stable in /3rj. Due to significant systematic errors of j3$ rB (or 
Poo), the constraints on the radial variation of the anisotropy 
for a single galaxy cluster are very weak. Most probably this 
caveat cannot be avoided in any approach aiming to deter- 
mine the anisotropy profile of galaxies in a cluster (see e.g. 
Hwang & Lee 2008). 



5.3 The anisotropy profile from the joint analysis 

The only possibility to obtain reliable constraints on the 
anisotropy profile is to increase the number of data points 
used in the analysis. In subsection 4.3 we showed that a 
sample of 3000 data points is expected to provide conclusive 
results in this matter. When dealing with galaxy clusters, 
such increase in the amount of data can only be achieved 
at present by means of the joint analysis of several clusters 
(~ 10 clusters). An additional advantage of this approach 
is that all local irregularities of the phase space diagrams 
compensate each other minimizing systematic errors. 

A common approach to carry out a joint analysis of 
several clusters consists of two stages. First, one merges all 
phases space diagrams with properly rescaled radii R and 




Figure 9. Anisotropy profile parameters obtained from the joint 
MCMC fits of 10 simulated clusters with rapidly (top panel) and 
moderately (bottom panel) rising anisotropy profiles. The con- 
tours indicate the boundaries of the la credibility regions with 
solid, dashed and dotted lines referring to the projection along 
the x, y and z axis respectively. The corresponding maximum a 
posteriori (MAP) values are indicated with letters x, y and z. The 
straight solid line represents a family of fiat anisotropy profiles. 



velocities ui os . As the units of phase space coordinates one 
most often uses the virial radius r v and the so-called virial 
velocity V v = (GM v /r v ) 1/2 . Then, assuming homology of 
the cluster sample, the resulting phase space diagram is re- 
analyzed in the framework of a model reduced by the pa- 
rameters that were used to scale the radii and velocities. 
The final constraints obtained in this approach are thought 
to represent typical properties of galaxy clusters in a given 
sample (e.g. van der Marel et al. 2000; Mahdavi & Geller 
2004; Lokas et al. 2006). 

Our approach differs from the above scheme in both re- 
spects. First, instead of merging many phase space diagrams 
we introduce the following generalization of the likelihood 
function ((9]) for the joint analysis of n phase space diagrams 

f(M sA ,...,M s oo J — 

n N 

II 1J. /:.,•:/.',.. |.\/... :./•_.,. >\ }), (12) 

j=l i=l 

where j and i are respectively the reference number of a 
cluster and a data point of j-th phase space diagram. The 
likelihood as well as the posterior probability distribution 
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Figure 10. Anisotropy profile obtained from the joint MCMC 
fits of 10 simulated clusters with rapidly (left panels) and moder- 
ately (right panels) rising anisotropy profiles, in three directions 
of observation (rows) . The widths of the shaded regions are given 
by the lcr credibility range of the anisotropy at a given radius. 
Solid lines are the true anisotropy profiles measured from the 
simulations. 

are functions of 2n + 2 parameters (the parameter Lo is 
fixed as discussed in subsection 4.1). We assume that the 
two parameters of the anisotropy profile are common to all 
clusters so that the final result concerning both of them is 
expected to provide a typical variation of the anisotropy in 
a given cluster sample. Our knowledge of M s ,i and r Sti ob- 
tained in the previous analysis (see Fig. [5} is incorporated in 
the prior probability. For the sake of maintaining the analyt- 
ical form of the prior, we used the product of 2n Gaussian 
distributions centred at the MAP values of M s ,i and r s ,i. 
The dispersion of each Gaussian was fixed at the disper- 
sion of the corresponding marginal probability distribution. 
It is interesting to note that if we arbitrarily ignored the 
credibility region of the nuisance parameters M s ,i and r s ,i, 
our approach would be conceptually very similar to the case 
when all phase space diagrams are merged except that radii 
and velocities would be scaled by r s and V s . 

We analyzed two samples of clusters, with n = 10 ob- 
jects each, observed in three directions as described in sec- 
tion 3. The samples separate the objects with distinctly ris- 
ing anisotropy profiles from those with flat or moderately 
rising ones. This provides an opportunity to verify whether 
our approach is able to distinguish between these two cases. 
Fig. [9] shows the la credibility regions in the 0o — /3oo plane 
inferred from the Markov chains of 10 4 models. First, we 
notice that all results are consistent with the velocity dis- 
persion tensor that is isotropic in the cluster centre and radi- 
ally biased outside. Second, the resulting anisotropy profiles 
are clearly steeper for clusters with rising anisotropy profiles 
(top panel) and flatter for the others (bottom panel). 

In order to check the radial variation of the anisotropy 



in Fig. [10] we plot the la credibility regions of its radial 
profiles. The resulting anisotropy effectively traces the true 
profiles (solid lines) within the radial range covering more 
than two orders of magnitude around r s . A local deviation 
from the true values occurs only once (middle right panel 
for r < 0.5r s ) as a result of relatively low quality of this 
data sample (see the results shown with asterisks in Fig. [8j) . 
Since the constraints on /3o are considerably tighter than on 
/3oo, statistical errors of local anisotropy typically increase 
with radius. Nevertheless, it is still possible to distinguish 
between steeply rising (left panels) and flat (right panels) 
profiles. Note that the accuracy achieved in this analysis is 
close to the upper limit, since statistical errors become com- 
parable to the internal dispersion of the anisotropy profiles 
in the cluster samples (see e.g. the bottom right panel of 
Fig. 1 10ft . Still, one may still expect to improve the results 
and minimize systematic errors by adding more clusters. 



6 DISCUSSION 

We have performed the Bayesian analysis of mock phase 
space diagrams of galaxy clusters in terms of a fully 
anisotropic model of the phase space density. Our approach 
allows to constrain parameters of the total mass profile, M s 
and r s , as well as the asymptotic values of the anisotropy 
profile, /3o and /3oo . The phase space model was designed to 
detect the radial variation of the anisotropy profile within 
a fixed distance range covering two orders of magnitude in 
radius around r s . The choice of this radial range is moti- 
vated by the results from cosmological simulations (Wojtak 
et al. 2008) and our goal to avoid degeneracy between radial 
scales and the asymptotes of the anisotropy. 

Parameters of the mass profile are determined in our 
approach with rather satisfying average accuracy of about 
30 percent. On the other hand, around 15 percent of the 
results are probably subject to systematic error and differ 
from the true values by 50 — 90 percent. Nevertheless, it is 
worth noting that the typical relative errors of 30 percent in 
the virial mass are comparable to the potential accuracy of 
mass determination from other methods, such as modelling 
of X-ray gas (e.g. Nagai, Vikhlinin & Kravtsov 2007), anal- 
ysis of velocity moments (Sanchis, Lokas & Mamon 2004) 
or the standard virial mass estimator (e.g. Biviano et al. 
2006). The constraints on the concentration parameter are 
however more reliable and tighter in comparison to the ap- 
proach based on velocity moments (Sanchis, Lokas & Ma- 
mon 2004; Lokas et al. 2006). 

We find that the scale mass M s and the virial mass M v 
tend to be underestimated on average by 11 and 15 percent 
respectively. It is very likely that this bias is caused by small 
velocity interlopers, as described in Biviano et al. (2006), 
which are not tractable by algorithms for interloper removal 
and effectively decrease the velocity dispersion of a system. 
Interestingly, a similar offset in the virial mass estimates is 
noticed in the analysis of mock X-ray data of clusters (e.g. 
Ameglio et al. 2009; Nagai et al. 2007). However, this hap- 
pens probably incidentally, since these authors claim that 
their bias is due to the lack of hydrostatic equilibrium in 
the outer parts of clusters. 

The constraints on the anisotropy profile of a single 
cluster are barely conclusive. The reason for this is the pres- 



© 0000 RAS, MNRAS 000, 000-000 



The mass and anisotropy profiles of galaxy clusters 11 



ence of substructures at large clustercentric distances which 
gives rise to significant systematic errors of p x . The final 
effect is that, although the central asymptotic anisotropy 
is determined rather precisely, the overall constraint on the 
anisotropy profile is rather poor. This situation changes in 
the case of a joint analysis of several phase space diagrams. 
We find that it is then easy to obtain reliable constraints on 
the radial variation of the anisotropy within the radial range 
of two orders of magnitude around the scale radius r s . Note 
that we adopted the phase space scaling that is consistent 
with the intrinsic parameters of the mass profile: M s and r s , 
in contrast to commonly used assumption that a sample of 
phase space diagrams is homologous with respect to scaling 
by the virial radius and virial velocity (e.g. van der Marel et 
al. 2000; Lokas ct al. 2006). 

In this work we demonstrated the potential and dis- 
cussed the reliability of the analysis of kinematic data of 
galaxy clusters in terms of an anisotropic phase space den- 
sity model. The method is able to provide robust constraints 
on the parameters of the total mass profile and the mean 
anisotropy profile. The results could certainly contribute to 
tests of the mass-concentration relation as well as improve 
constraints on the mean anisotropy profile of galaxy clusters 
(e.g. van der Marel et al. 2000; Biviano & Katgert 2004; 
Lokas et al. 2006). This will be the subject of a follow-up 
work where we analyze real kinematic data for a sample of 
~ 20 nearby galaxy clusters. 
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